ABSTRACT

It has been shown that spectroscopy of transiting extrasolar planets can potentially provide a wealth
of information about their atmospheres. Herein, we set up the inverse problem in spectroscopic re- 
trieval. We use non-linear optimal estimation to retrieve the atmospheric state (pioneered for Earth 
sounding by Rodgers 1976, 2000). The formulation quantifies the degrees of freedom and information
content of the spectrum with respect to geophysical parameters; herein, we focus specifically on
temperature and composition. First, we apply the technique to synthetic near infrared spectra, and 
explore the influence of spectral signal-to-noise ratio and resolution (the two important parameters 
when designing a future instrument) on the information content of the data. As expected, we find that 
the number of retrievable parameters increases with increasing signal-to-noise and resolution, although
the gains quickly level off for large values. Second, we apply the methods to the previously studied 
atmosphere of HD 189733b, and compare the results of our retrieval with those obtained by others. 
Subject headings: planetary systems — planets and satellites: atmospheres — radiative transfer —
methods: data analysis — planets and satellites: individual (HD 189733b)



1. INTRODUCTION 

Currently there are about 130 confirmed transiting ex- 
oplanets (www.exoplanet.org). Of these planets, sev- 
eral dozen have spectra that have been observed, either 
through broadband photometry from instruments like 
the Spitzer Infrared Array Camera (IRAC) (Knutson et 
al. 2007; 2008; Harrington et al. 2006; 2007; Steven- 
son et al. 2011) or higher resolution spectroscopy from 
the Hubble Space Telescope (HST) Near Infrared Cam- 
era and Multi-Object Spectrometer (NICMOS) (Swain 
et al. 2009a; 2009b), Spitzer Infrared Spectrometer 
(IRS) (Grillmair et al. 2007; 2008), and recently, from 
ground based instruments (Swain et al. 2010; Mandel 
et al. 2011). Although the spectra are of low resolution 
($R = \lambda/\Delta\lambda \sim 5$-50) and low signal to noise (S/N < 10),
they nevertheless provide useful information about the 
temperature and composition of the exoplanetary atmospheres
(Tinetti et al. 2007; 2010; Madhusudhan &
Seager 2009; etc.). A typical approach to retrieving this
information is to match the data set with forward models 
by manually tuning the model abundances and temper- 
atures, until a possible best fit is obtained (Tinetti et al. 
2007; 2011; Swain et al. 2009a; 2009b). This approach 
does not provide an optimal solution to the atmospheric 
state; furthermore, it can be cumbersome and is suscep- 
tible to multiple degeneracies (Tinetti et al. 2007; Sing 
et al. 2008; Madhusudhan & Seager 2009).

Others have used multi-dimensional grid models to 
constrain atmospheric parameters (Madhusudhan & Sea- 
ger 2009), a method that is well tuned to systematically 
searching the parameter space given sparse data (as with 
Spitzer IRAC color photometry). In this approach, an 
ensemble of forward models is generated using up to
10 gridded free parameters (6 to govern the shape of the 
temperature profile and 4 scaling factors for uniform mix- 
ing ratios of H2O, CH4, CO, and CO2); model families 
that best describe the data are selected based on a chi-squared
statistic criterion. Because of the degeneracies
between the different gases, and between gases and temperature,
thousands of solutions can exist within a given
chi-squared region, thus only giving loose constraints on 
the atmospheric composition and temperature. Further- 
more, the formalism provides no easy way to explore the 
change in information content associated with a change 
in the data phase space (e.g., R or S/N). 

Here, we present the inverse approach (see also Lee et 
al. 2011) that determines the atmospheric "state" (i.e. 
its temperature structure and abundances) by minimiz- 
ing a cost function that simultaneously takes into account 
new measurements and prior knowledge of atmospheric 
properties (such as a state retrieved from previous obser- 
vations). Additionally we determine, within the context 
of our model, the quality of the spectra and the number
of useful retrievable atmospheric properties. This work 
represents the first attempt at determining the amount 
of useful information that can be retrieved from typical 
exoplanet spectra. Furthermore, this paper represents 
the first attempt at using information theoretic limits for 
retrievals assuming certain instrument capabilities (such 
as R and S/N). Ultimately, the theory is general and en- 
ables prediction of the advances that can be made with 
improvements in instrumentation and via more prudent 
choice of spectral ranges. 

In §2 we outline the basics of the classic retrieval the- 
ory of Rodgers (2000). We first test the technique on an 
artificial dataset and explore how the number of retriev- 
able parameters depends on R and S/N and discuss how 
these can be optimized to maximize the usefulness of a 
measurement in §3. We then apply these techniques to 
the well studied HD189733b dayside emission spectra in 
§4. This is followed by a discussion and conclusions in 
§5. 

2. METHOD 

2.1. Retrieval Theory 

The retrieval problem is well known in the field of 
Earth atmospheric studies (Rodgers 1976, Chahine 1968, 
Twomey 1977) and in studies of planetary atmospheres 
(see e.g., Nixon et al. 2007). The fundamental problem 
is to determine the state vector, x of dimension n, often 
a vector of temperatures and mixing ratios at different 
altitudes (but could be other desirable variables), given 
some set of observations, y of dimension m, usually a 
vector of flux values at each wavelength. In the absence 
of any noise, they can be related through y=F(x), where 
F(x) is a model that simulates the measurement at each 
wavelength given a representative atmosphere. In an ide- 
alized scenario, if the relationship between x and y is 
linear, we can linearize F(x) and write 



$$\mathbf{y} = \mathbf{F}(\mathbf{x}_a) + \mathbf{K}(\mathbf{x} - \mathbf{x}_a) \tag{1}$$



where K is the m x n Jacobian matrix whose elements
are given by the Fréchet derivative

$$K_{ij} = \frac{\partial F_i(\mathbf{x})}{\partial x_j} \tag{2}$$

with $F_i$ being the measurement in the i-th channel, and $x_j$
the value of the j-th parameter. The vector $\mathbf{x}_a$ is the prior
(a priori) state, which represents our best initial guess 
of the true state before the observations are made. The 
Jacobian describes the sensitivity of the measurement at 



each wavelength in a spectrum to a perturbation of a 
given parameter in the forward model. If the lengths of 
x and y are the same then (1) may be readily inverted
to

$$\mathbf{x} = \mathbf{x}_a + \mathbf{K}^{-1}(\mathbf{y} - \mathbf{F}(\mathbf{x}_a)) \tag{3}$$

Real data are often noisy and usually have a large number
of measurements that overconstrain the atmospheric
state. For this we must use a more sophisticated scheme 
to invert the data to determine the atmospheric proper- 
ties. This can be readily achieved by using a Bayesian 
framework. In the remainder of this section, we present 
the basic formalism and useful equations and algorithms 
that we can use to retrieve atmospheric properties from 
spectra as well as their information content, following 
the derivations in Rodgers (2000). For further details, 
see either Rodgers (2000) or Jacob (2007). 
Bayes' theorem can be written as

$$P(\mathbf{x}|\mathbf{y}) \propto P(\mathbf{y}|\mathbf{x})\,P(\mathbf{x}) \tag{4}$$



where P(x) is the prior probability distribution, which 
is knowledge of the atmospheric state before making a 
measurement, P(y|x) is the likelihood function, that is 
the probability that the data exists within the context 
of a particular model, and P(x|y) is the posterior probability
density function, which can be interpreted
as the probability that some state x, in our case
atmospheric state, exists given the observations, y. If we 
assume Gaussian probability distributions for the obser- 
vational error and for the a priori information, we can 
write 

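$$P(\mathbf{y}|\mathbf{x}) \propto \exp\left[-\tfrac{1}{2}(\mathbf{y}-\mathbf{K}\mathbf{x})^T\mathbf{S}_e^{-1}(\mathbf{y}-\mathbf{K}\mathbf{x})\right] \tag{5}$$

$$P(\mathbf{x}) \propto \exp\left[-\tfrac{1}{2}(\mathbf{x}-\mathbf{x}_a)^T\mathbf{S}_a^{-1}(\mathbf{x}-\mathbf{x}_a)\right] \tag{6}$$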



where $\mathbf{S}_e$ is the m x m diagonal error covariance matrix
(assuming no correlation between measurements) and $\mathbf{S}_a$
is the n x n a priori covariance matrix. The a priori
covariance matrix represents our prior knowledge of the
natural variability of the system and, like $\mathbf{S}_e$, it is assumed
to be diagonal. It essentially defines our "trust" region, 
or how far from the prior state we think the actual state 
can exist. In general, the prior constraint should be loose 
enough to allow flexibility in the retrieval but not so loose 
that the retrieval fails when a measurement contributes 
no information. 

Using Bayes theorem from (4) we can write the pos- 
terior probability distribution as a product of (5) and 
(6) 

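$$P(\mathbf{x}|\mathbf{y}) \propto \exp\left[-\tfrac{1}{2}J(\mathbf{x})\right] \tag{7}$$

where J(x) is the cost function and is given by

$$J(\mathbf{x}) = (\mathbf{y}-\mathbf{K}\mathbf{x})^T\mathbf{S}_e^{-1}(\mathbf{y}-\mathbf{K}\mathbf{x}) + (\mathbf{x}-\mathbf{x}_a)^T\mathbf{S}_a^{-1}(\mathbf{x}-\mathbf{x}_a) \tag{8}$$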

The first term in the cost function represents the contri- 
bution from the data. The second term represents the 
contribution from the prior knowledge. If the data is of 
good quality (high S/N, and high R) then the data term 
will dominate. Since the product of two Gaussians is a 
Gaussian, equation (8) can be equivalently written as 



$$J(\mathbf{x}) = (\mathbf{x}-\hat{\mathbf{x}})^T\hat{\mathbf{S}}^{-1}(\mathbf{x}-\hat{\mathbf{x}}) \tag{9}$$

where $\hat{\mathbf{x}}$ and $\hat{\mathbf{S}}$ are the mean and covariance, respectively,
of the posterior probability distribution. A diagonal element
of $\hat{\mathbf{S}}$ is the variance in the j-th component of the
state vector, $\hat{S}_{jj} = \sigma_j^2$, where $\sigma_j$ is the retrieval uncertainty
in the parameter.

The goal of any retrieval is to obtain the most likely 
set of atmospheric parameters given the data. This is 
achieved when (7) is maximized which occurs at the mean 
of the posterior probability function. Equating (8) and 
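(9) we can solve for $\hat{\mathbf{x}}$ and $\hat{\mathbf{S}}$ to get

$$\hat{\mathbf{x}} = \mathbf{x}_a + \mathbf{G}(\mathbf{y} - \mathbf{K}\mathbf{x}_a) \tag{10}$$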

where G is the gain matrix that describes the sensitivity 
of the retrieval to the observations (if G=0, no sensi- 
tivity, then the measurements do not contribute towards 
the retrieved state) , given by 



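$$\mathbf{G} = \frac{\partial\hat{\mathbf{x}}}{\partial\mathbf{y}} = \hat{\mathbf{S}}\mathbf{K}^T\mathbf{S}_e^{-1} \tag{11}$$

with

$$\hat{\mathbf{S}} = \left(\mathbf{K}^T\mathbf{S}_e^{-1}\mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1} \tag{12}$$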



As the elements of $\mathbf{S}_a$ approach $\infty$ or the elements of $\mathbf{S}_e$
approach 0, $\mathbf{G}$ approaches $\mathbf{K}^{-1}$, which is identically
the sensitivity of the state vector to the observations, and 
thus the retrieval is fully characterized by the data. 
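
As an illustration of the linear solution, the following minimal Python sketch evaluates the gain matrix, posterior covariance, and retrieved state, assuming equation (10) reads $\hat{\mathbf{x}} = \mathbf{x}_a + \mathbf{G}(\mathbf{y} - \mathbf{K}\mathbf{x}_a)$ together with (11) and (12). The function name `linear_retrieval` and the toy Jacobian, prior, and covariances are hypothetical and are not taken from the retrieval code used in this work. With very precise data and a very loose prior, GK approaches the identity, which mirrors the limiting behavior described above.

```python
import numpy as np

def linear_retrieval(y, K, x_a, S_e, S_a):
    """Linear optimal estimation: returns x_hat, S_hat and the gain matrix G."""
    S_e_inv = np.linalg.inv(S_e)
    S_a_inv = np.linalg.inv(S_a)
    # Posterior covariance, equation (12)
    S_hat = np.linalg.inv(K.T @ S_e_inv @ K + S_a_inv)
    # Gain matrix, equation (11)
    G = S_hat @ K.T @ S_e_inv
    # Retrieved state, equation (10), written with the prior state x_a
    x_hat = x_a + G @ (y - K @ x_a)
    return x_hat, S_hat, G

# Toy problem (illustrative numbers only): m = 3 measurements, n = 2 parameters
K = np.array([[1.0, 0.5],
              [0.2, 1.0],
              [0.7, 0.3]])
x_true = np.array([1.0, -2.0])
x_a = np.zeros(2)                 # deliberately poor prior state
S_e = 1.0e-6 * np.eye(3)          # very precise measurements
S_a = 1.0e2 * np.eye(2)           # very loose prior
y = K @ x_true                    # noise-free synthetic measurement

x_hat, S_hat, G = linear_retrieval(y, K, x_a, S_e, S_a)
print("retrieved state:", x_hat)
print("G K:", G @ K)
```

Increasing the diagonal of `S_e` in this toy case pulls `x_hat` back toward `x_a`, which is the data-versus-prior trade-off expressed by the two terms of the cost function.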

If the forward model is linear, then (10) can be solved 
to obtain the desired state vector. Often, the forward 
model is non-linear, generally the case in radiative trans- 
fer; it is then best to use a numerical iteration scheme to 
determine the state vector. In the non-linear case the Kx 
terms in the cost function in (8) are replaced with F(x).
The Levenberg-Marquardt iteration scheme is used to 
find the minimum of the non-linear cost function. The 
prescribed scheme is given by 

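$$\mathbf{x}_{k+1} = \mathbf{x}_k + \left[(1+\gamma)\mathbf{S}_a^{-1} + \mathbf{K}_k^T\mathbf{S}_e^{-1}\mathbf{K}_k\right]^{-1}\left\{\mathbf{K}_k^T\mathbf{S}_e^{-1}\left[\mathbf{y} - \mathbf{F}(\mathbf{x}_k)\right] - \mathbf{S}_a^{-1}\left[\mathbf{x}_k - \mathbf{x}_a\right]\right\} \tag{13}$$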

where $\mathbf{x}_k$ and $\mathbf{x}_{k+1}$ are the state vectors for the k-th and
(k+1)-th iterations, and $\mathbf{K}_k$ is the Jacobian matrix calculated
at the k-th iteration. $\gamma$ is a factor that controls
the rate of convergence and is adjusted at each iteration
(Press et al. 1995). Equation (13) is iterated until convergence,
when

$$(\mathbf{x}_k - \mathbf{x}_{k+1})^T\hat{\mathbf{S}}^{-1}(\mathbf{x}_k - \mathbf{x}_{k+1}) \ll n \tag{14}$$

Upon convergence, we obtain the retrieved state, $\hat{\mathbf{x}}$, and
its precision, $\hat{\mathbf{S}}$.
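
The iteration can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper: the step follows the Rodgers (2000) form of the Levenberg-Marquardt update, the Jacobian is approximated by forward differences, the rule for adjusting the damping factor (divide by 10 on success, multiply by 10 on failure) is one common choice, and the forward model `toy_forward` is a hypothetical stand-in for a radiative transfer code.

```python
import numpy as np

def jacobian(F, x, eps=1.0e-6):
    """Forward-difference Jacobian K_ij = dF_i/dx_j (stand-in for analytic derivatives)."""
    f0 = F(x)
    K = np.zeros((f0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        K[:, j] = (F(xp) - f0) / eps
    return K

def cost(F, x, y, x_a, Se_inv, Sa_inv):
    """Non-linear cost function: data term plus prior term."""
    r, p = y - F(x), x - x_a
    return r @ Se_inv @ r + p @ Sa_inv @ p

def lm_retrieve(F, y, x_a, S_e, S_a, gamma=1.0, tol=1.0e-3, max_iter=100):
    Se_inv, Sa_inv = np.linalg.inv(S_e), np.linalg.inv(S_a)
    x = np.array(x_a, dtype=float)
    for _ in range(max_iter):
        K = jacobian(F, x)
        S_hat_inv = K.T @ Se_inv @ K + Sa_inv
        rhs = K.T @ Se_inv @ (y - F(x)) - Sa_inv @ (x - x_a)
        # (1 + gamma) S_a^-1 + K^T S_e^-1 K  ==  S_hat_inv + gamma * S_a^-1
        step = np.linalg.solve(S_hat_inv + gamma * Sa_inv, rhs)
        if cost(F, x + step, y, x_a, Se_inv, Sa_inv) < cost(F, x, y, x_a, Se_inv, Sa_inv):
            x, gamma = x + step, gamma / 10.0          # accept step, relax damping
            if step @ S_hat_inv @ step < tol * x.size:  # convergence test: d^2 << n
                break
        else:
            gamma *= 10.0                               # reject step, damp harder
    K = jacobian(F, x)
    S_hat = np.linalg.inv(K.T @ Se_inv @ K + Sa_inv)
    return x, S_hat

def toy_forward(x):
    """Hypothetical non-linear forward model with 2 parameters and 4 channels."""
    return np.array([np.exp(x[0]), x[0] * x[1], x[1] ** 2, x[0] + x[1]])

x_true = np.array([0.5, 1.5])
x_a = np.array([0.0, 1.0])
y = toy_forward(x_true)                     # noise-free synthetic data
x_hat, S_hat = lm_retrieve(toy_forward, y, x_a, 1.0e-4 * np.eye(4), 1.0e2 * np.eye(2))
print("retrieved state:", x_hat)
```

The threshold `tol * n` is one way to encode "much smaller than n"; a stricter or looser value only changes how many iterations are run.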

2.2. Information Content & Degrees of Freedom 

The information content (Shannon & Weaver 1962) 
and total number of degrees of freedom are useful quan- 
tities that can help diagnose the quality and ability of a 
spectral data set to contribute to our knowledge of the 
atmospheric state. The number of degrees of freedom 
represents how many independent parameters can be re- 
trieved from the spectrum, and the information content 
is a metric of how much the precision in the retrieved 
parameters has improved as a result of the observation. 
In the simplest sense, if there are m independent measurements
with no error (e.g., fluxes at m different wavelengths),
then there will be at most m independent



pieces of information (degrees of freedom) that can be 
obtained from the observations. If m is fewer than the 
number of model parameters, n, the exact values of n—m 
parameters cannot be obtained from the observations. 
We do not discuss those cases in this article; we choose
only cases for which m > n. For a given forward model,
with n parameters, the maximum number of obtainable 
degrees of freedom will be the smaller of n and m. In 
an ideal case the total number of degrees of freedom will 
be close to n, meaning that the observations can be fully 
characterized by those n parameters. 

In reality, measurements are susceptible to error, and the
total number of degrees of freedom in the observed signal
(denoted by $d_s$), and thus the number of parameters
accessible to our retrieval, may be fewer than the number
of independent measurements, n. Some degrees of
freedom, $d_n$, can be lost in the noise. The sum of $d_s$ and
$d_n$ must add up to the total number of parameters we
are seeking, n.

Before calculating the degrees of freedom it is useful 
to first introduce the averaging kernel, A. The averaging 
kernel tells us which of the parameters in the state vector 
have the greatest impact on the retrieval, that is, the 
sensitivity of the retrieval to a given parameter, given by 



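$$\mathbf{A} = \frac{\partial\hat{\mathbf{x}}}{\partial\mathbf{x}} = \frac{\partial\hat{\mathbf{x}}}{\partial\mathbf{y}}\frac{\partial\mathbf{y}}{\partial\mathbf{x}} = \mathbf{G}\mathbf{K} \tag{15}$$

A is an n x n matrix whose elements are given by

$$A_{ij} = \frac{\partial\hat{x}_i}{\partial x_j} \tag{16}$$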



If a diagonal element of A is unity, or close to it, then 
that means for a given change in the true atmospheric 
state, there is identically the same change in the retrieved 
state. This suggests that the parameter, $x_j$, is fully characterized
by the data. If that diagonal element is less
than unity, meaning that the data itself is not of a high 
enough quality to constrain that parameter, then some 
fraction of the a priori information must have been used 
in determining the value of that parameter. If each parameter
is fully characterized by the data, that is, if all
of the diagonal elements of A are unity, then we would
expect to be able to retrieve all n parameters. If the di- 
agonal elements are less than unity, then the sum of the 
diagonals would be less than n. In essence, the diago- 
nal elements of the averaging kernel can be thought of as 
the degrees of freedom per parameter. If the value of a 
particular diagonal element is 1, then that parameter is
well characterized by the data. If it is much less than 1, 
then the data contributes little to our knowledge of that 
parameter. The total degrees of freedom from the signal 
can be determined by calculating the trace of A. The 
difference between n and the trace of A is the number of 
degrees of freedom lost to the noise. 
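
A short Python sketch makes the bookkeeping concrete. It assumes A = GK with the gain matrix and posterior covariance defined in §2.1; the helper names and the toy Jacobian are hypothetical. The second column of the toy Jacobian is zero, so the measurements carry no information on the second parameter and its diagonal averaging kernel element vanishes.

```python
import numpy as np

def averaging_kernel(K, S_e, S_a):
    """A = G K, with G = S_hat K^T S_e^-1 and S_hat the posterior covariance."""
    Se_inv = np.linalg.inv(S_e)
    S_hat = np.linalg.inv(K.T @ Se_inv @ K + np.linalg.inv(S_a))
    G = S_hat @ K.T @ Se_inv
    return G @ K

def degrees_of_freedom(A):
    """Return (d_s, d_n): signal degrees of freedom and those lost to noise."""
    d_s = np.trace(A)
    return d_s, A.shape[0] - d_s

# Toy case: 3 channels, 2 parameters; the channels do not respond to parameter 2
K = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [1.0, 0.0]])
A = averaging_kernel(K, S_e=0.01 * np.eye(3), S_a=np.eye(2))
d_s, d_n = degrees_of_freedom(A)
print("diag(A):", np.diag(A))
print("d_s =", d_s, " d_n =", d_n)
```

Here the first diagonal element is 600/601 and the second is zero, so roughly one of the two parameters is retrievable from the data and the other is set entirely by the prior.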

The total degrees of freedom, again, tell us how many 
independent parameters can be determined from the observations.
The information content, H, tells us quantitatively
how well the observations increased our confidence
in our estimate of the atmospheric state relative to
the a priori knowledge. In more precise language, the
information content of a measurement is the reduction in
the entropy of the probability that an atmospheric
state exists given some set of observations, or

$$H = \mathrm{entropy}(P(\mathbf{x})) - \mathrm{entropy}(P(\mathbf{x}|\mathbf{y})) \tag{17}$$

The entropy of a Gaussian distribution of width $\sigma$, which
the prior and posterior distributions are assumed to be,
can be shown to be proportional to $\ln(\sigma)$. Using this fact,
and equations (17), (6), and (9),

$$H = \frac{1}{2}\ln\left(\left|\hat{\mathbf{S}}^{-1}\mathbf{S}_a\right|\right) \tag{18}$$

From this we can see that if the data is good (small error 
bars), then the elements of $\hat{\mathbf{S}}$ will be small, resulting
in a large H. Thus H is a quantitative measure of the
reduction in our uncertainty in the retrieved atmospheric 
state as a result of the observations. The larger the value 
of H, the more useful the observations are in constraining 
the atmospheric state. 
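
The dependence of H on data quality can be checked numerically. The sketch below assumes the form $H = \frac{1}{2}\ln|\hat{\mathbf{S}}^{-1}\mathbf{S}_a|$ with the natural logarithm (so H is in nats; dividing by ln 2 would give bits), and uses a hypothetical toy Jacobian and covariances.

```python
import numpy as np

def information_content(K, S_e, S_a):
    """H = 0.5 * ln|S_hat^-1 S_a| in nats, with S_hat^-1 = K^T S_e^-1 K + S_a^-1."""
    S_hat_inv = K.T @ np.linalg.inv(S_e) @ K + np.linalg.inv(S_a)
    # slogdet returns (sign, log|det|) and avoids overflow for large matrices
    _, logdet = np.linalg.slogdet(S_hat_inv @ S_a)
    return 0.5 * logdet

K = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [1.0, 0.0]])
S_a = np.eye(2)
H_noisy = information_content(K, 1.0 * np.eye(3), S_a)     # large error bars
H_precise = information_content(K, 0.01 * np.eye(3), S_a)  # small error bars
print("H (noisy)   =", H_noisy)
print("H (precise) =", H_precise)
```

Reducing the measurement variance by a factor of 100 raises H in this toy case, while a channel set that does not respond to a parameter contributes nothing to H for that parameter.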

In summary, both $d_s$ and H are quantitative measures
of the quality and usefulness of the observations in determining
the atmospheric state, within the context of
a given forward model. From their definitions we would
expect that a spectrum with a higher S/N, or a higher 
R, would result in higher values. We will show this in 
section §3. 

2.3. Forward Model 

A relatively simple forward model, F(x), which 
nonetheless captures the basic physics and the measurement
process, is at the core of our retrieval. We assume
a simplified understanding of the physical and chemical
state of the exoplanet atmosphere, i.e., a parameterized
temperature structure, the major volatile constituents,
the important radiative processes, and the instrument
line profiles etc. Our forward model, as most
such models, is an approximation because the data are
of limited quality, the underlying physics is relatively ill-understood,
and simplifying approximations are necessary.
Examples of physics missing in our F(x) include
absent species, inaccurate line lists, clouds, aerosols, 3D 
effects etc., or possibly insufficient parameterization of 
the atmosphere. Therefore, our retrievals must be taken 
in context of our chosen forward model. Herein, we 
only consider the dayside spectra of hot-Jupiters with 
near solar metallicity, though the methods can easily be
extended to other kinds of observations (transmission
spectra) and exoplanets (hot-Neptunes, mini-Neptunes,
super-Earths etc.) with relatively minor modifications to 
the forward model. For future instruments, with broader 
spectral coverage and higher spectral resolution, the for- 
ward models can increase in sophistication. 

Lacking sufficient data (these are low signal-to-noise,
low resolution spectra), we simplify our atmosphere to
8 parameters that characterize the temperature structure
and gas concentrations. For sake of simplicity, we
use an analytic temperature profile formulated by Guillot 
(2010), and since then modified by Parmentier & Guillot, 
(in preparation) to include three channels. The profile, 
derived using a 3 channel approximation, is given by 

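$$T^4(\tau) = \frac{3T_{int}^4}{4}\left(\frac{2}{3}+\tau\right) + \frac{3T_{irr}^4}{4}(1-\alpha)\,\xi_{\gamma_1}(\tau) + \frac{3T_{irr}^4}{4}\alpha\,\xi_{\gamma_2}(\tau) \tag{19}$$

where

$$\xi_{\gamma_i}(\tau) = \frac{2}{3} + \frac{2}{3\gamma_i}\left[1 + \left(\frac{\gamma_i\tau}{2} - 1\right)e^{-\gamma_i\tau}\right] + \frac{2\gamma_i}{3}\left(1 - \frac{\tau^2}{2}\right)E_2(\gamma_i\tau) \tag{20}$$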

with $\gamma_1 = \kappa_{v1}/\kappa_{IR}$ and $\gamma_2 = \kappa_{v2}/\kappa_{IR}$, where $\kappa_{v1}$, $\kappa_{v2}$,
and $\kappa_{IR}$ are the visible and infrared (thermal) opacities,
respectively. The parameter $\alpha$ (range 0 to 1) partitions
the flux between the two visible streams, and $E_2(\gamma\tau)$
is the second order exponential integral function. The
internal heat flux (from the net cooling history) is represented
by the temperature $T_{int}$, while the stellar flux at
the top of the atmosphere is represented by $T_{irr}$; these
two temperatures are fixed. Assuming zero albedo and
unit emissivity, $T_{irr}$ is

$$T_{irr} = \left(\frac{R_*}{2a}\right)^{1/2} T_* \tag{21}$$

where $R_*$ and $T_*$ are the stellar radius and temperature,
a is the star-planet separation, and $\tau$ is the infrared (thermal)
optical depth

$$\tau = \frac{\kappa_{IR}P}{g} \tag{22}$$

with P the pressure and g the surface gravity (at 1 bar).
In total there are 4 free parameters governing the temperature
structure: $\kappa_{IR}$, $\kappa_{v1}$, $\kappa_{v2}$, and $\alpha$. We choose this
parameterization with two visible streams as opposed to
the traditional one visible stream (Hansen 2005; Guillot
2010) because the extra stream allows more freedom for a 
temperature inversion, though in some cases (as we shall 
see below) the second visible stream does not matter. 

The remaining 4 parameters are the uniform mixing 
ratios for H2O, CH4, CO, CO2, expected to be the ma- 
jor molecular opacity sources (Tinetti et al., 2007; Swain 
et al., 2009a). We choose vertically uniform mixing ra- 
tios for two reasons. First, the data lack sufficient in- 
formation content to actually help resolve vertical struc- 
ture in abundances, and second, chemical kinetics models 
(Moses et al. 2011; Line et al. 2010, 2011), show that 
vertical mixing leads to constant vertical mixing ratios 
for these species within the IR photosphere, so even if 
we could resolve detailed vertical information, we would 
most likely find that the abundances remain fairly con- 
stant. 

Since many of these parameters may vary over many 
orders of magnitude we find it convenient with the above 
formalism to solve for the logarithm of the atmospheric 
state. With that, the state vector of parameters that we 
would like to retrieve can be given by 

$$\mathbf{x} = \left[\log(\kappa_{v1}),\ \log(\kappa_{v2}),\ \log(\kappa_{IR}),\ \alpha,\ \log(f_{H_2O}),\ \log(f_{CH_4}),\ \log(f_{CO}),\ \log(f_{CO_2})\right]^T$$

where $f_i$ is the mixing ratio of species i in parts per
million (ppm) and the opacities are in cm$^2$ g$^{-1}$.

We also include H2-H2 and H2-He collision induced 
opacity. The mixing ratios of H2 and He vary little across
the atmospheric levels that produce the bulk of the dayside
thermal emission (500-2000 K, 10-10$^{-4}$ bar). We
fix $f_{H_2}$ and $f_{He}$ to thermochemical abundances (assuming
solar elemental abundances) of 0.86 and 0.14, respectively.
These values may change on the tens of percent
level in enriched atmospheres, however, this variation has 
negligible effect on the resultant infrared spectra. Also, 
we do not include NH3 as an opacity source as it has 
little influence in the spectral region we consider. 

We use the Reference Forward Model (RFM), a line-by-line
radiative transfer code, to calculate the disk integrated
dayside emission spectra, modified to handle H2-H2
and H2-He collisionally induced opacities. The collisionally
induced opacity tables are taken from Borysow et
al. (2001; 2002) and Jørgensen et al. (2000). The molecular
line strengths for H2O, CO2, and CO are from the
HITEMP (Rothman et al. 2010) database and CH4 is
from the HITRAN 2008 database (Rothman et al. 2009).
In order to keep the molecular line-lists from becoming 
too unwieldy we make an intensity cutoff at 298 K of 
10$^{-?}$ cm molecule$^{-1}$, as recommended by Sharp & Burrows
(2007).

3. TEST ON SYNTHETIC DATA 

First, we test the retrieval method on a synthetic data 
set for which we know the answer. Using this synthetic 
spectrum, we explore the effect that signal-to-noise and 
spectral resolution have on the degrees of freedom and 
information content. 

A hypothetical hot-Jupiter atmosphere is generated using
$\kappa_{v1} = \kappa_{v2} = 4 \times 10^{-3}$ cm$^2$ g$^{-1}$, $\kappa_{IR} = 1 \times 10^{-2}$
cm$^2$ g$^{-1}$, $\alpha = 0.5$, and fixed vertical mixing ratios of
$f_{H_2O} = 5 \times 10^{-4}$, $f_{CH_4} = 1 \times 10^{-6}$, $f_{CO} = 3 \times 10^{-4}$, and
$f_{CO_2} = 1 \times 10^{-?}$. The planet orbits around a G0V host
star (e.g. HD 209458a) with $T_* = 6000$ K, $R_* = 1.14\,R_\odot$
at a separation of a = 0.064 AU. The planetary properties
are a radius of 1.35 $R_J$, an internal temperature of
$T_{int} = 200$ K, and g = 21.1 m s$^{-2}$ (at 1 bar pressure).
Using (21) we find $T_{irr} = 1223$ K. The emission spectrum
of the exoplanet (see Figure 1) is initially generated with
a resolution of one wavenumber (resolving power, R ~5000
at 2 µm).
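
Two of the numbers quoted above can be checked with a few lines of Python. The sketch assumes that equation (21) has the form $T_{irr} = (R_*/2a)^{1/2}T_*$ and that the resolving power is $R = \tilde{\nu}/\Delta\tilde{\nu}$ with the wavenumber $\tilde{\nu} = 10^4/\lambda[\mu\mathrm{m}]$ cm$^{-1}$; the function names and the adopted values of the solar radius and astronomical unit are choices made for this illustration.

```python
import math

R_SUN_M = 6.957e8         # nominal solar radius in metres (assumed value)
AU_M = 1.495978707e11     # astronomical unit in metres

def irradiation_temperature(T_star, R_star_m, a_m):
    """T_irr for zero albedo and unit emissivity, assuming T_irr = sqrt(R*/2a) T*."""
    return math.sqrt(R_star_m / (2.0 * a_m)) * T_star

def resolving_power_from_wavenumber(wavelength_um, delta_nu_cm):
    """R = nu / delta_nu, with nu = 1e4 / wavelength[um] in cm^-1."""
    nu = 1.0e4 / wavelength_um
    return nu / delta_nu_cm

T_irr = irradiation_temperature(6000.0, 1.14 * R_SUN_M, 0.064 * AU_M)
R = resolving_power_from_wavenumber(2.0, 1.0)
print("T_irr [K] =", T_irr)
print("R at 2 um for 1 cm^-1 sampling =", R)
```

With these constants the irradiation temperature comes out at about 1221 K, within a fraction of a percent of the quoted 1223 K; the small difference presumably reflects the physical constants adopted. The resolving power evaluates to 5000, as stated.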

For the initial test, the synthetic spectrum (Figure 1) 
is degraded by convolving it with an instrumental profile 
matching the defocussed HST NIC3 camera with a spectral
full width at half maximum of 0.055 µm (R ~ 40 at
2 µm; Swain et al. 2009a), and reducing the measurement
signal-to-noise of each spectral channel to ~ 10.
Rather than be guided by physical and chemical models,
or some previous observation of the object, we arbitrarily
chose an a priori state, $\mathbf{x}_a$, far from the true physical
state. The remaining unspecified quantity is the a priori
covariance matrix, $\mathbf{S}_a$. Once more, the diagonal elements
of $\mathbf{S}_a$ are allowed a large range as we are dealing
with a relatively novel type of observations and lack detailed
prior information. We also assume that there are

^ see http://www.atm.ox.ac.uk/RFM/

^ Upon completion of our initial investigation it was also brought
to light that there exist more appropriate high temperature
based line lists for methane such as the STDS (http://icb.u-bourgogne.fr/OMR/SMA/SHTDS/HTDS.html).
Using this line
list over HITRAN makes absolutely no difference for our synthetic 
work since the synthetic data was produced using the HITRAN 
methane. We have also compared our HD189733b retrieval results 
for both methane line lists and found no difference. 
Fig. 1. — Synthetic spectrum (bottom) generated with the model
atmosphere (top) with a spectral resolution of 1 cm$^{-1}$, or R~5000
at 2 µm. The model temperature profile is generated from equations
(19) and (20) with $\kappa_{v1} = \kappa_{v2} = 4 \times 10^{-3}$ cm$^2$ g$^{-1}$, $\kappa_{IR} =
1 \times 10^{-2}$ cm$^2$ g$^{-1}$, $\alpha = 0.5$, $T_{irr} = 1223$ K, and $T_{int} = 200$ K.
The constant-with-altitude mixing ratios are $f_{H_2O} = 5 \times 10^{-4}$,
$f_{CH_4} = 1 \times 10^{-6}$, $f_{CO} = 3 \times 10^{-4}$, and $f_{CO_2} = 1 \times 10^{-?}$.

no cross correlations between different state parameters
(e.g. $f_{CO}$ and $f_{CO_2}$, even though from chemical models
we know that such quantities have high correlations).
Because the state parameters are logarithmic, the elements
of $\mathbf{S}_a$ are also logarithmic (with the exception of
$\alpha$) so we set, somewhat arbitrarily, $\sigma_{\kappa_{v1}} = 2$, $\sigma_{\kappa_{v2}} = 2$,
$\sigma_{\kappa_{IR}} = 2$, $\sigma_{\alpha} = 0.5$, $\sigma_{f_{H_2O}} = 6$, $\sigma_{f_{CH_4}} = 6$, $\sigma_{f_{CO}} = 6$,
and $\sigma_{f_{CO_2}} = 6$, meaning that the opacities are permitted
to span 4 orders of magnitude centered around their a
priori value and the mixing ratios are allowed to span 12
orders of magnitude. Such large a priori uncertainties
lead to a flat a priori distribution, relative to the data,
reducing the current problem to a maximum likelihood
estimation (as opposed to Bayesian), with the option of
using the a priori information if the data is sparse.
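
The prior covariance described above is simple to write down explicitly. The following Python sketch assumes base-10 logarithms (implied by the reference to orders of magnitude) and the parameter ordering of the state vector in §2.3; the variable names are hypothetical.

```python
import numpy as np

# Parameter order assumed to follow the state vector of Section 2.3
names = ["log kappa_v1", "log kappa_v2", "log kappa_IR", "alpha",
         "log f_H2O", "log f_CH4", "log f_CO", "log f_CO2"]

# A priori 1-sigma widths: dex for the logarithmic parameters, linear for alpha
sigma = np.array([2.0, 2.0, 2.0, 0.5, 6.0, 6.0, 6.0, 6.0])

# Diagonal a priori covariance matrix (no cross correlations assumed)
S_a = np.diag(sigma ** 2)

# Full +/- 1 sigma span, in orders of magnitude, for the logarithmic parameters
is_log = np.array([name.startswith("log") for name in names])
span_dex = 2.0 * sigma[is_log]

for name, s in zip(np.array(names)[is_log], span_dex):
    print(f"{name}: spans {s:.0f} orders of magnitude")
```

A correlated prior, for instance between CO and CO2, would simply add off-diagonal elements to `S_a`; the diagonal form here mirrors the assumption made in the text.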

The entirety of the forward model can be summarized
with the Jacobian. Figure 2 shows the columns of the Jacobian
evaluated at the true state (response of the flux in
each channel to a perturbation in each of the parameters
in x) for the synthetic data (Figure 3). The spectrum
is most sensitive to perturbations in the opacities that
govern the temperature profile. The 1.7 µm and 2.2 µm
channels are most sensitive to changes in the temperature
profile. This is because there are no large absorption features
at these wavelengths, meaning that these channels are
most sensitive to the flux from deeper layers (1-10 bars).
This also partially explains why $\kappa_{IR}$ and $\kappa_{v1}$ have opposite
responses. An increase in $\kappa_{IR}$ results in an increase
in flux due to an increase in temperature in the deep
layers probed by these channels, as can be seen in (19).
An increase in $\kappa_{v1}$ results in a decrease of flux in these
channels due to a decrease in temperature in the deeper
layers. From (19), an increase in $\kappa_{v1}$ increases the temperature
above the ~0.1 bar level, and in order to maintain
radiative equilibrium at the top of the atmosphere,
a decrease in temperature in the deeper layers must occur;
a higher $\kappa_v$ also prevents the stellar flux from
penetrating into the deeper atmosphere. The opposite is
true near 2.9 µm, which is more sensitive to higher altitudes
because of the large absorption; thus an increase
in $\kappa_{v1}$ will result in an increase in temperature, which
in turn results in a flux increase. Also, in this particular
case $\alpha = 0.5$, meaning both $\kappa_{v1}$ and $\kappa_{v2}$ have identically
the same results. Additionally, $\kappa_{v1} = \kappa_{v2}$, which causes the
spectrum to have no sensitivity to changes in $\alpha$.

The spectral response is more sensitive to the water
abundance than to that of any other gas across all wavelengths
in this example (Figure 2). This makes the retrieval
of water more precise than that of the other species. The
greatest sensitivity to changes in the CO2 abundance occurs
at 2.1 and 2.8 µm, which both happen to be located
near the sensitivity minima of CO and CH4, though it
still has to contend with water. Both CO and CH4 have
their greatest sensitivity in the 2.3 µm band, making it difficult
to simultaneously retrieve both.

Figure 3 shows the retrieval process for this initial syn- 
thetic test case. We determine the quality of the retrieval 
using the standard reduced chi-squared given by

$$\chi^2 = \frac{1}{N}\sum_{i=1}^{N}\frac{(y_i - F_i)^2}{\sigma_i^2}$$

where N is the total number of data points, and $y_i$, $F_i$, and
$\sigma_i$ are defined in §2.1. If $\chi^2$ is less than one, then the
difference between the model fit and data is typically
better than 1$\sigma$. We should stress, however, that a perfect
fit ($\chi^2 = 0$) does not necessarily mean that the true state
has been retrieved, because of the degeneracies between
some of the parameters. Table 1 compares the true state
to the retrieval results along with the retrieval precision.
The synthetic retrieval demonstrates the robustness of
the retrieval to a poor a priori. The reason for this can be
seen by inspecting the elements of the averaging kernel.
From Table 1, all but $\kappa_{v2}$ and methane are fairly well
characterized by the data ($A_{jj}$ is close to 1). Summing
these values gives the total degrees of freedom, and thus
the total number of useful retrievable parameters, of ~ 6.
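
Both diagnostics in this paragraph are easy to reproduce. The Python sketch below assumes the reduced chi-squared is the mean of the squared, error-normalized residuals over the N data points, and sums the diagonal averaging kernel elements listed in the last column of Table 1; the function name and the small test spectrum are hypothetical.

```python
import numpy as np

def reduced_chi_squared(y, F, sigma):
    """Mean squared residual in units of the measurement error."""
    y, F, sigma = (np.asarray(v, dtype=float) for v in (y, F, sigma))
    return np.sum(((y - F) / sigma) ** 2) / y.size

# Hypothetical two-channel example: residuals of exactly 1 sigma in each channel
chi2 = reduced_chi_squared(y=[1.0, 2.0], F=[0.0, 4.0], sigma=[1.0, 2.0])
print("reduced chi-squared =", chi2)

# Diagonal averaging kernel elements A_jj from Table 1, in table order
A_diag = np.array([0.997, 0.0, 0.998, 0.999, 0.999, 0.334, 0.896, 0.768])
d_s = A_diag.sum()
print("total degrees of freedom d_s =", d_s)
```

The sum of the tabulated diagonal elements is 5.991, consistent with the ~6 retrievable parameters quoted above.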



3.1. Resolution and Signal to Noise Effects on the 
Degrees of Freedom & Information Content 

The S/N and R are two important factors that in- 
fluence the quality and usefulness of a spectrum. It is 
thus imperative to consider them when designing a spec- 
trometer. In this section we use our synthetic dataset 
to explore how the degrees of freedom, both total and 
Fig. 2. — Columns of the Jacobian for the synthetic spectrum
evaluated at the true state. This is the response of the flux as
a function of wavelength due to a small positive perturbation in
one of the parameters in x. The top panel is the flux response for
the parameters that govern the temperature profile, $\kappa_{v1}$, $\kappa_{v2}$, and $\kappa_{IR}$.
The bottom panel is the flux response to a small perturbation in the
gas mixing ratios, $f_{H_2O}$, $f_{CH_4}$, $f_{CO}$, and $f_{CO_2}$. The Jacobian is
calculated as a change in the planet-to-star flux ratio, $\Delta(F_p/F_*)$, due to
a positive logarithmic perturbation in a given parameter, $\Delta\log(x_j)$.
Note that in the bottom panel an increase in the gas mixing ratios
always results in a decrease in $F_p/F_*$. In this particular case, the
spectrum is equally sensitive to $\kappa_{v1}$ and $\kappa_{v2}$ because $\alpha$ is 0.5. If
$\alpha = 0$ then the spectrum will have no sensitivity to $\kappa_{v2}$, and if
$\alpha = 1$ the spectrum will have no sensitivity to $\kappa_{v1}$. Also, for this
synthetic dataset $\kappa_{v1} = \kappa_{v2}$, which results in no sensitivity to $\alpha$.

per atmospheric parameter, and the information content 
evolve with increasing S/N and R. 

We would intuitively expect $d_s$ and H both to increase
with increasing R and S/N. Figure 4 shows a contour
plot of $d_s$ and H calculated for the synthetic spectrum
generated in Figure 1 for a variety of S/N's and R's. The
maximum increase in both occurs with a simultaneous*
increase in S/N and R.

We point out that the contour plots in Figure 4 can 
only be taken in the context of the spectral window 
within which we are applying the retrieval, and the num- 
ber of parameters we are trying to retrieve. In other 
words, for the 8 parameters we are retrieving here, there 
is no benefit to increasing R or S/N beyond a few hundred
and ~100, respectively. If we do happen to have a
higher R and S/N, it is likely that we would be able to 
retrieve more forward model parameters such as the con- 
centrations of other gases, or information on the vertical 

* This is true if R and S/N are independent of each other. In 
most cases S/N decreases with increasing R because of the smaller 
spectral bins. 
Fig. 3. — Synthetic spectrum retrieval. Left: Iteration sequence of the model spectrum, $\mathbf{F}(\mathbf{x}_k)$. The diamonds with error bars are the
synthetic data convolved down to a resolution of 0.055 µm (R~37 at 2 µm) and a signal-to-noise of 10. The thick red curve is the forward
model spectrum generated from the a priori, $\mathbf{F}(\mathbf{x}_a)$. Note that it is a poor fit to the data. Each subsequent curve is the new model
spectrum after each iteration of equation 13. The thick solid blue curve is the final retrieved model spectrum. Right: Evolution of the
temperature profile with each iteration. The thick red curve is the a priori temperature profile. The thick blue curve is the retrieved
temperature profile. The diamond symbol curve is the true temperature profile as in figure 1. $\chi^2$ converges to 0.007 after 8 iterations of
equation 13.



TABLE 1 

Synthetic retrieval results. κ_v1, κ_v2, and κ_IR are in units of (cm² g⁻¹). f_i is the volume mixing 
ratio for species i. We also show the diagonal averaging kernel elements (A_jj = ∂x̂_j/∂x_j) for each 
parameter. The retrieval uncertainties are given as x̂ - σ to x̂ + σ for each parameter. 

Parameter   True State (x)   A Priori (x_a)   Retrieved State (x̂)   Retrieval Precision          A_jj 
κ_v1        4.00×10^-3       1.00×10^-3       3.59×10^-3            2.76×10^-3 - 4.68×10^-3      0.997 
κ_v2        4.00×10^-3       1.00×10^-2       1.70×10^-?            1.70×10^-? - 1.70×10^-7      0.0 
κ_IR        1.00×10^-2       3.16×10^-2       8.93×10^-3            7.13×10^-3 - 1.12×10^-2      0.998 
α           0.5              0.1              0.003                 0.00 - 0.022                 0.999 
f_H2O       5.00×10^-?       1.00×10^-6       4.18×10^-?            2.58×10^-? - 6.76×10^-4      0.999 
f_CH4       1.00×10^-?       1.00×10^-4       3.43×10^-7            4.34×10^-12 - 2.70×10^-2     0.334 
f_CO        3.00×10^-4       1.00×10^-6       1.96×10^-?            2.27×10^-? - 1.69×10^-2      0.896 
f_CO2       1.00×10^-?       1.00×10^-4       7.70×10^-?            9.95×10^-1? - 5.96×10^-4     0.768 

distributions of the gases. Current observations, like the 
HST NICMOS observations of HD189733b, generally fall 
towards the bottom left corners in Figure 4. This suggests 
that the S/N and R of such data are not high enough 
to fully constrain even our simple forward model, and 
are thus even less able to constrain more complicated models. 

The increase in d_s with increasing S/N can 
be seen through the use of (11), (12), and (15). As S/N 
goes to infinity, the elements of S_e go to zero, causing G to 
approach K^-1, in turn causing A to approach the identity 
matrix, meaning the diagonal elements are all ones 
with a trace equal to the total number of parameters 
and thus the maximum number of degrees of freedom. 
The relationship between d_s and S/N can be seen in a 
1-parameter 1-channel model, where d_s = A. Upon reducing 
the matrix equations, the one element averaging 
kernel becomes 

$$ d_s = A = \frac{\left(\sigma_a K / F\right)^2 (S/N)^2}{1 + \left(\sigma_a K / F\right)^2 (S/N)^2} \qquad (24) $$

and the relation of these parameters to the information 
content is 

$$ H = \frac{1}{2} \ln\left[ 1 + \frac{\sigma_a^2}{F^2} K^2 (S/N)^2 \right] \qquad (25) $$

where the scalars K, σ_a, and F are the 1-D analogs of the 
matrices K and S_a and the vector F(x), respectively. We 
also have assumed that σ_e, the 1-D analog for S_e, is the 
flux, F, divided by S/N. In this case, d_s approaches unity 
as S/N goes to infinity, and zero if S/N is zero. H approaches 
infinity as S/N goes to infinity, and approaches zero when 
S/N goes to zero. One important thing to note from these 
relations is that increasing S/N will matter only if the 
Jacobian, K, is non-zero, meaning that there must be some 
sensitivity of the flux to a perturbation in the desired 
parameter. Otherwise, no amount of S/N increase will improve 
our knowledge of the atmospheric state. Increasing R or 
adding more spectral channels can also contribute to an 
increase in d_s and H. If channels are chosen such that 
K is large, meaning large sensitivity to a given parameter, 
then d_s and H will both increase. As K approaches 
infinity (infinite sensitivity), d_s will approach unity and 
H will approach infinity. 
From this simple analysis, though it may be intuitively 
Fig. 4. S/N and R effects on the total degrees of freedom (left) and the information content (right). In general, as S/N and R increase, 
the total number of degrees of freedom obtainable from the data, and the information content increase. See equations (24) and (25). 



obvious, we can readily see that if we want to improve 
the characterization of a particular atmospheric property, 
it is best to design an instrument whose spectral regions 
offer the greatest sensitivity to that parameter, and to 
have a high S/N within those spectral regions. 
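
As a quick numerical check of the limits discussed above, the sketch below evaluates one-parameter, one-channel expressions for d_s and H. It assumes the scalar forms d_s = q/(1+q) and H = (1/2) ln(1+q), with q = (σ_a K/F)² (S/N)², which is what the standard optimal estimation matrices reduce to when σ_e = F/(S/N) and H is defined as in Rodgers (2000). The function names and the sample values of K, σ_a, and F are hypothetical.

```python
import math

def q_factor(K, sigma_a, F, snr):
    """Squared ratio of prior-weighted sensitivity to noise, with sigma_e = F / snr."""
    return (sigma_a * K * snr / F) ** 2

def dof_1d(K, sigma_a, F, snr):
    """One-element averaging kernel, equal to the degrees of freedom d_s."""
    q = q_factor(K, sigma_a, F, snr)
    return q / (1.0 + q)

def info_1d(K, sigma_a, F, snr):
    """Information content H in nats for the scalar case."""
    return 0.5 * math.log1p(q_factor(K, sigma_a, F, snr))

# d_s rises from 0 toward 1 and H grows without bound as S/N increases
for snr in (0.0, 1.0, 10.0, 100.0):
    print(snr, dof_1d(1.0, 1.0, 1.0, snr), info_1d(1.0, 1.0, 1.0, snr))

# With zero sensitivity (K = 0), no amount of S/N helps
print(dof_1d(0.0, 1.0, 1.0, 1e6), info_1d(0.0, 1.0, 1.0, 1e6))
```

Because S/N and K enter only through their product, doubling the sensitivity of a channel has the same effect on d_s and H as doubling its S/N in this scalar picture.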

4. TEST ON REAL DATA: HD189733B DAYSIDE EMISSION 

Now that we have demonstrated that this retrieval pro- 
cedure works and provides useful information about the 
quality of a data set through the degrees of freedom and 
information content, we wish to apply it to the dayside 
emission spectra of one of the best-studied exoplanet at- 
mospheres, HD189733b. We assume the same forward 
model and a priori covariances as in the synthetic work. 

The dayside emission spectrum of HD189733b has been 
subject to much investigation (Swain et al. 2009a; Grillmair 
et al. 2007; Madhusudhan & Seager 2009; and many 
others), and different analyses often arrive at 
different solutions for its composition and temperature 
structure. For simplicity we investigate only the near IR 
spectrum from Swain et al. (2009a). As an a priori 
atmospheric state we use the "Fortney 2π" (Fortney et al. 
2010) temperature profile from Figure 2 of Moses et al. 
(2011), approximated with equation (19), and the 0.1 bar 
mixing ratios for H2O, CH4, CO, and CO2 from their 
Table 2, assumed to be constant with altitude within the 
IR photosphere sampled by the observations (because of 
quenching arguments). Figure 5 and Table 2 show the 
results of the retrieval. The Jacobian in Figure 5 
demonstrates the high sensitivity of the spectrum to water and 
carbon dioxide, some sensitivity to CO near 2.3 μm, and 
very little sensitivity to methane at all wavelengths. The 
1.7 and 2.2 μm channels are sensitive to the deep 
temperatures (affected by κ_IR) due to the higher 
transmittance at those wavelengths. The strong CO2 absorption 
feature at 2.1 μm has less sensitivity to the deep 
temperatures and more sensitivity to temperatures higher up 
(controlled by κ_v1 and κ_v2). 



The diagonal elements of the averaging kernel in Ta- 
ble 2 quantitatively tell us which parameters we can 
and cannot retrieve from the dayside emission spectra. 
Again, H2O, CO and CO2 have averaging kernel elements 
that are near unity and are therefore well constrained by 
the data, as is also reflected in the retrieval uncertainty, 
which is smaller than the assumed a priori uncertainty. 
CH4 is completely unconstrained. The retrieval uncer- 
tainty is the same as the a priori uncertainty, suggesting 
that the observations contribute no information about its 
abundance. The trace of the averaging kernel gives the 
total number of degrees of freedom, and thus the total 
number of retrievable parameters, to be ~5. 
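
As a small arithmetic check, the diagonal averaging kernel elements listed in Table 2 can be summed to recover the quoted total. The values below are copied from the table; the dictionary keys and the 0.9 threshold used to flag well constrained parameters are arbitrary illustrative choices, not a criterion from this work.

```python
# Diagonal averaging kernel elements A_jj as listed in Table 2
A_diag = {
    "kappa_v1": 0.475,
    "kappa_v2": 0.475,
    "kappa_IR": 0.990,
    "alpha": 0.00,
    "f_H2O": 0.997,
    "f_CH4": 0.00,
    "f_CO": 0.993,
    "f_CO2": 0.998,
}

# The trace of the averaging kernel is the total degrees of freedom
d_s = sum(A_diag.values())

# Hypothetical cut: elements above 0.9 are treated as constrained by the data
well_constrained = [name for name, a in A_diag.items() if a > 0.9]

print(f"d_s = {d_s:.3f}")
print(well_constrained)
```

The sum is close to 5 out of 8 forward model parameters, with the two visible opacities each contributing roughly half a degree of freedom.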

Our results compare quite well with those of Mad- 
husudhan & Seager (2009) and with Swain et al. (2009a) 
with the exception of CO2 (Table 2) which appears to be 
underestimated by three orders of magnitude in Swain 
et al. (2009a). Our derived temperature profile (Figure 
5, bottom right) also appears to fall within the spread 
given in Figure 5 of Madhusudhan & Seager (2009). 



5. DISCUSSION & CONCLUSIONS 

We demonstrate retrieval by inverse modeling of ex- 
trasolar planetary spectra. We first apply the technique 
to a synthetic model spectrum of a solar metallicity 
T ~ 1200 K hot Jupiter, and then to a previously published 
HST NICMOS spectrum of HD 189733b, showing 
results that are consistent with previous studies. The 
approach herein is much more efficient than other methods 
such as a gridded parameter search or Monte Carlo 
techniques, as it only requires ~10^? forward model 
computations as opposed to millions. The formalism also 
allows robust estimation of the retrieval uncertainties. 
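
To make the cost argument concrete, the following is a minimal sketch of an iterative optimal estimation retrieval that counts forward model evaluations. It assumes the Gauss-Newton form of the Rodgers (2000) iteration and a forward-difference Jacobian, which may differ in detail from equation 13 and from the implementation used in this work. The two-parameter exponential forward model, the noise level, the prior, and all names are hypothetical.

```python
import numpy as np

t = np.linspace(0.0, 2.0, 13)   # 13 toy channels
n_calls = 0                     # forward model evaluation counter

def forward_model(x):
    """Toy nonlinear forward model (hypothetical): x[1] * exp(-x[0] * t)."""
    global n_calls
    n_calls += 1
    return x[1] * np.exp(-x[0] * t)

def retrieve(y, x_a, S_a, S_e, max_iter=20, tol=1e-8, h=1e-6):
    """Iterate x_{k+1} = x_a + G_k [y - F(x_k) + K_k (x_k - x_a)] until the step is small."""
    x = x_a.copy()
    converged = False
    for n_iter in range(1, max_iter + 1):
        F_x = forward_model(x)
        # Forward-difference Jacobian: one extra forward model call per parameter
        K = np.empty((y.size, x.size))
        for j in range(x.size):
            x_pert = x.copy()
            x_pert[j] += h
            K[:, j] = (forward_model(x_pert) - F_x) / h
        G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_e)
        x_new = x_a + G @ (y - F_x + K @ (x - x_a))
        converged = bool(np.linalg.norm(x_new - x) < tol)
        x = x_new
        if converged:
            break
    return x, n_iter, converged

x_true = np.array([1.0, 2.0])
y = x_true[1] * np.exp(-x_true[0] * t)   # noise-free synthetic data (not counted)
x_a = np.array([0.5, 1.5])               # deliberately poor a priori
S_a = np.eye(2)
S_e = 0.01 ** 2 * np.eye(t.size)

x_hat, n_iter, converged = retrieve(y, x_a, S_a, S_e)
print(x_hat, n_iter, n_calls)
```

Each iteration costs one forward model call plus one per parameter for the Jacobian, so the total scales as (number of iterations) × (number of parameters + 1) rather than with the size of a parameter grid.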

We have also investigated the information theory as- 
pects of the problem, in order to assess the quality and 
usefulness of a spectral data set in constraining atmo- 
spheric properties. First, we discuss how the Jacobian 
Fig. 5. Retrieval results for the NICMOS dayside emission spectra of HD189733b from Swain et al. (2009a). Top Left: The sensitivity 
of the planet-to-star flux ratio to a perturbation in the mixing ratios of H2O, CO2, CO, and CH4 at each channel in the NICMOS dataset. 
Top Right: The sensitivity of the planet-to-star flux ratio to a perturbation in the parameters governing the temperature profile. Bottom 
Left: The retrieved spectrum. The black diamonds with error bars are the Swain et al. (2009a) dayside emission data. The red curve is the 
a priori spectrum convolved with the instrumental broadening profile and sampled at the data wavelengths. The orange curve is the retrieved 
spectrum at high resolution. The blue dots are the retrieved spectrum convolved with the instrumental broadening function and sampled 
at the data wavelengths. This optimal solution gives χ² = 0.76. Bottom Right: The a priori (red) and retrieved (blue) temperature profiles. 



TABLE 2 

Retrieval results for HD189733b. κ_v1, κ_v2, and κ_IR are in units of (cm² g⁻¹). f_i is the volume mixing ratio for 
species i. We also show the diagonal averaging kernel elements (A_jj = ∂x̂_j/∂x_j) for each parameter. The total 
number of degrees of freedom for this spectrum is ~5. The retrieval precisions are given as x̂ - σ to x̂ + σ for 
each parameter. We also show for comparison the abundances derived by Madhusudhan & Seager (2009) (MS10) 
and Swain et al. (2009a) (S09a). 

Parameter   A Priori (x_a)   Retrieved State (x̂)   Retrieval Precision          A_jj    MS10                 S09a 
κ_v1        4.00×10^-3       4.71×10^-3            1.67×10^-4 - 1.32×10^-1      0.475 
κ_v2        4.00×10^-3       4.71×10^-3            1.67×10^-4 - 1.32×10^-1      0.475 
κ_IR        3.00×10^-2       4.70×10^-2            3.00×10^-2 - 7.36×10^-2      0.990 
α           0.5              0.5                   0.00 - 1.00                  0.00 
f_H2O       4.00×10^-?       1.19×10^-4            5.29×10^-5 - 2.67×10^-4      0.997   ~10^-4               1×10^-5 - 1×10^-4 
f_CH4       1.00×10^-6       9.78×10^-9            9.79×10^-15 - 9.77×10^-3     0.00    <6×10^-?             <1×10^-? 
f_CO        ?.00×10^-?       1.15×10^-2            3.60×10^-3 - 3.64×10^-2      0.993   2×10^-4 - 2×10^-2    1×10^-4 - 3×10^-4 
f_CO2       1.00×10^-?       3.37×10^-3            1.69×10^-3 - 6.72×10^-3      0.998   7×10^-4              1×10^-? - 1×10^-? 
matrix can be used to determine which spectral channels 
are most sensitive to chosen atmospheric parameters. 
Second, we show the use of the averaging kernel as 
a diagnostic tool to guide us to which parameters can be 
usefully retrieved from the spectrum in question. Third, 
we calculated the number of available degrees of freedom 
and often found that, given the current limited observa- 
tional capabilities, the number of retrievable parameters 
was less than the number of parameters in our forward 
model. Fourth, using simple expressions for the degrees 
of freedom and information content, we showed 
semi-quantitatively how S/N and R affect our knowledge of 
the atmospheric state. These tools can be particularly 
useful in aiding the design of future instruments such 
that they can be optimized for observations of transiting 
exoplanets. 

A recent paper (Lee et al. 2011) using the optimal 
estimation approach as applied to HD 189733b was 
published while this article was in preparation. The details 
of the methodology in that paper are somewhat different 
from ours, i.e. in the parameterization of the atmospheric 
models and in the use of correlated-k opacities (we 
use line-by-line radiative transfer). In addition, Lee et 
al. use multi-band (i.e. from various instruments inclu- 
sive of HST NICMOS, Spitzer IRAC, IRS and MIPS), 
multi-epoch measurements of HD 189733b as a represen- 
tative snapshot of the planetary dayside. We restrict our 
retrieval to a single epoch, 13 spectral-channel NICMOS 
observation spanning less than one octave of total spec- 
tral coverage between 1.45-2.5 microns. Our retrievals 
agree for the most part with those of Lee et al., in that 
H2O and CO2 are retrieved with confidence but neither 
retrieval can say much about the abundance of methane 
(a trace species in HD 189733b). One clear discrepancy 
is that we are able to retrieve CO whereas they cannot. 
Also, Lee et al. do not discuss the information 
content aspects of the atmospheric retrieval formulation 
presented in both of these papers. 

In follow-on investigations, we plan to use the information 
content analyses to study aspects of combining 
Spitzer broadband photometry with prior notions about 
the atmospheric state to constrain atmospheric properties 
such as CH4/CO and C/O ratios. A powerful use of 
these methods is in optimizing the design of instruments 
that could be flown in NASA's FINESSE and ESA's 
Exoplanet Characterization Observatory, or in studying 
the potential of already designed instruments such as 
JWST's NIRCam that offer various observing modes, 
bandpasses and spectral resolving power. 